cold posterior effect




Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect

Neural Information Processing Systems

The "cold posterior effect" (CPE) in Bayesian deep learning describes the disturbing observation that the predictive performance of Bayesian neural networks can be significantly improved if the Bayes posterior is artificially sharpened using a temperature parameter T < 1. The CPE is problematic in theory and in practice, and since the effect was identified many researchers have proposed hypotheses to explain the phenomenon. However, despite this intensive research effort, the effect remains poorly understood. In this work we provide novel and nuanced evidence relevant to existing explanations for the cold posterior effect, disentangling three hypotheses: 1. The dataset curation hypothesis of Aitchison (2020): we show empirically that the CPE does not arise in a real curated dataset but can be produced in a controlled experiment with varying curation strength. Our results demonstrate how the CPE can arise in isolation from synthetic curation, data augmentation, and bad priors.
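The tempering described in this abstract has a closed form in a toy conjugate model. The sketch below is illustrative only (a Gaussian likelihood with known variance and a Gaussian prior, not the paper's neural-network setting): raising the unnormalized posterior to the power 1/T scales every precision by 1/T, so the posterior mean is unchanged while its variance shrinks by a factor of T.

```python
import numpy as np

def tempered_posterior(data, sigma2=1.0, tau2=1.0, T=1.0):
    """Closed-form tempered posterior over the mean of a Gaussian.

    Tempering raises (likelihood * prior) to the power 1/T; for this
    conjugate model that simply scales all precisions by 1/T.
    """
    n = len(data)
    prec = (n / sigma2 + 1.0 / tau2) / T       # tempered posterior precision
    var = 1.0 / prec
    mean = var * (np.sum(data) / sigma2) / T   # mean is independent of T
    return mean, var

rng = np.random.default_rng(0)
data = rng.normal(0.5, 1.0, size=20)

m_warm, var_warm = tempered_posterior(data, T=1.0)
m_cold, var_cold = tempered_posterior(data, T=0.5)
assert var_cold < var_warm  # T < 1 sharpens (concentrates) the posterior
```

In this toy model a cold posterior only expresses higher confidence around the same estimate; the CPE is the empirical observation that, in deep networks, this sharpening also improves predictive performance.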


A Further Related Work

Neural Information Processing Systems

Motivated by the behavior of Bayesian inference in misspecified models, Grünwald et al. (2017) and Jansen (2013) extensively studied so-called "generalized" Bayesian inference. However, these works consider only "warm posteriors": in Grünwald et al. (2017) the prior favours simple models, hence it is beneficial to put more weight on the prior and use a warm posterior. Finally, we mention the work of Bhattacharya et al. (2019), in which the authors develop fractional posteriors. Popular benchmark datasets, such as CIFAR-10, have been collected and curated. The Street View House Numbers dataset (Netzer et al., 2011) is divided into training, testing, and extra sets. In CIFAR-10 (Krizhevsky and Hinton, 2009), labellers followed strict guidelines to ensure high-quality labelling of the images. In particular, labellers were instructed that "it's worse to include one that shouldn't be included than to exclude one." In this section we review the basics of (SG-)MCMC inference.
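The (SG-)MCMC inference reviewed here can be illustrated with a tempered Langevin sampler on a toy Gaussian-mean posterior. This is a hedged sketch, not any paper's actual experimental setup: it uses the full gradient of the log-posterior (true SGLD would use a minibatch gradient estimate), and it scales the injected noise by the temperature T so the chain targets the tempered posterior.

```python
import numpy as np

def sgld_sample(data, steps=20000, eps=1e-3, T=1.0,
                sigma2=1.0, tau2=1.0, seed=0):
    """Langevin sampler for a Gaussian-mean posterior (full-gradient sketch).

    One step: theta += (eps/2) * grad_log_post(theta) + N(0, eps * T).
    Scaling the injected noise variance by T targets the tempered posterior,
    whose variance is T / (n/sigma2 + 1/tau2) in this conjugate model.
    """
    rng = np.random.default_rng(seed)
    n, s = len(data), np.sum(data)
    theta, samples = 0.0, []
    for t in range(steps):
        grad = (s - n * theta) / sigma2 - theta / tau2  # d/dtheta log p(theta|D)
        theta += 0.5 * eps * grad + rng.normal(0.0, np.sqrt(eps * T))
        if t > steps // 2:          # discard the first half as burn-in
            samples.append(theta)
    return np.array(samples)

rng = np.random.default_rng(0)
data = rng.normal(0.5, 1.0, size=20)
warm = sgld_sample(data, T=1.0)
cold = sgld_sample(data, T=0.25, seed=1)
assert cold.var() < warm.var()  # colder chain concentrates more tightly
```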



Priors Matter: Addressing Misspecification in Bayesian Deep Q-Learning

van der Vaart, Pascal R., Yorke-Smith, Neil, Spaan, Matthijs T. J.

arXiv.org Artificial Intelligence

Uncertainty quantification in reinforcement learning can greatly improve exploration and robustness. Approximate Bayesian approaches have recently been popularized to quantify uncertainty in model-free algorithms. However, so far the focus has been on improving the accuracy of the posterior approximation, instead of studying the accuracy of the prior and likelihood assumptions underlying the posterior. In this work, we demonstrate that there is a cold posterior effect in Bayesian deep Q-learning, where contrary to theory, performance increases when reducing the temperature of the posterior. To identify and overcome likely causes, we challenge common assumptions made on the likelihood and priors in Bayesian model-free algorithms. We empirically study prior distributions and show through statistical tests that the common Gaussian likelihood assumption is frequently violated. We argue that developing more suitable likelihoods and priors should be a key focus in future Bayesian reinforcement learning research and we offer simple, implementable solutions for better priors in deep Q-learning that lead to more performant Bayesian algorithms.
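The claim that the common Gaussian likelihood assumption is "frequently violated" can be probed with simple diagnostics. The sketch below is not the paper's actual statistical test: it is a moment-based normality check (sample excess kurtosis) on synthetic residuals, with a Student-t sample standing in for the kind of heavy-tailed TD errors a Q-learning critic might produce.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis: 0 for Gaussian data, > 0 for heavy tails."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)

rng = np.random.default_rng(1)
gauss_resid = rng.normal(size=50_000)            # matches the Gaussian assumption
heavy_resid = rng.standard_t(df=3, size=50_000)  # heavy-tailed stand-in for TD errors

assert abs(excess_kurtosis(gauss_resid)) < 0.1   # consistent with Gaussianity
assert excess_kurtosis(heavy_resid) > 1.0        # Gaussian assumption violated
```

A formal test (e.g. D'Agostino-Pearson or Shapiro-Wilk) would attach a p-value to the same discrepancy; the kurtosis gap alone already shows why a Gaussian likelihood can be badly misspecified.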


Appendix for On Uncertainty, Tempering, and Data Augmentation in Bayesian Classification

Neural Information Processing Systems

Overall, properly representing aleatoric uncertainty is a challenging but fundamentally important consideration in Bayesian classification. We have shown that posterior tempering provides a mechanism to more honestly represent our beliefs about aleatoric uncertainty, especially in the presence of data augmentation. In general, as in Wilson and Izmailov [62], we should not be alarmed if T = 1 is not optimal in sophisticated models on complex real-world datasets. Moreover, we have shown how other mechanisms to represent aleatoric uncertainty, such as the noisy Dirichlet model, do not suffer from a cold posterior effect in the presence of data augmentation. Indeed, while an interesting phenomenon, cold posteriors should not be conflated with the success or failure of Bayesian deep learning.




Cold Posteriors through PAC-Bayes

Pitas, Konstantinos, Arbel, Julyan

arXiv.org Machine Learning

We investigate the cold posterior effect through the lens of PAC-Bayes generalization bounds. We argue that in the non-asymptotic setting, when the number of training samples is (relatively) small, discussions of the cold posterior effect should take into account that approximate Bayesian inference does not readily provide guarantees of performance on out-of-sample data. Instead, out-of-sample error is better described through a generalization bound. In this context, we explore the connections between the ELBO objective from variational inference and the PAC-Bayes objectives. We note that, while the ELBO and PAC-Bayes objectives are similar, the latter naturally contain a temperature parameter $\lambda$ which is not restricted to be $\lambda=1$. For both regression and classification tasks, in the case of isotropic Laplace approximations to the posterior, we show how this PAC-Bayesian interpretation of the temperature parameter captures the cold posterior effect.
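The role of the temperature $\lambda$ can be made concrete in a toy conjugate model. The following is a hedged sketch, not the paper's setup: for a Gaussian variational family and a Gaussian-mean likelihood, weighting the KL term of the ELBO by $\lambda < 1$ moves the optimal posterior variance toward a sharper, "colder" posterior.

```python
import numpy as np

def lambda_elbo(m, v, data, lam, sigma2=1.0, tau2=1.0):
    """ELBO with a PAC-Bayes-style temperature on the KL term:
    E_q[log p(D|theta)] - lam * KL(q || p),
    for q = N(m, v), prior p = N(0, tau2), Gaussian likelihood (var sigma2)."""
    n = len(data)
    exp_loglik = (-0.5 * np.sum((data - m) ** 2) / sigma2
                  - 0.5 * n * v / sigma2
                  - 0.5 * n * np.log(2 * np.pi * sigma2))
    kl = 0.5 * (v / tau2 + m ** 2 / tau2 - 1.0 + np.log(tau2 / v))
    return exp_loglik - lam * kl

rng = np.random.default_rng(0)
data = rng.normal(0.5, 1.0, size=20)
grid = np.linspace(1e-3, 0.5, 2000)  # candidate posterior variances

def best_var(lam):
    """Grid-search the variance that maximizes the tempered ELBO."""
    vals = [lambda_elbo(data.mean(), v, data, lam) for v in grid]
    return grid[int(np.argmax(vals))]

assert best_var(0.3) < best_var(1.0)  # smaller lambda -> sharper posterior
```

In this model the optimum is available in closed form, $v^* = \lambda / (n/\sigma^2 + \lambda/\tau^2)$, which the grid search recovers; the point of the sketch is only that $\lambda$ acts directly on posterior concentration, mirroring the temperature in cold posteriors.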